Name: Hafiz Muhammad Hassan

Matricula: 1873829

  1. Illustration of the dataset:

Dataset downloaded from: https://raw.githubusercontent.com/GTPB/PSLS20/master/data/fev.txt

In this dataset, the FEV of 606 children between the ages of 6 and 17 was measured. The dataset also provides additional information on each child: age, height, gender and, most importantly, whether the child is a smoker or a non-smoker.

Features:

age: the age of the child in years.
height: the height of the child in inches.
gender: male or female, coded as ‘m’ and ‘f’ respectively.
smoking: whether the child smokes, coded as 1 (smoker) or 0 (non-smoker).
fev: forced expiratory volume, a measure of how much air a person can exhale (in litres) during a forced breath. This is the response variable we will model.

Before loading the dataset, I will load all the required libraries.

require(tidyverse)
require(magrittr)
require(R2jags)
require(mcmcse)
require(bayesplot)
require(TeachingDemos)

Reading the dataset into the FEVdata object.

FEVdata <- read.table(file="fev.txt",header=T, sep="\t")
head(FEVdata)
##   age   fev height gender smoking
## 1   9 1.708   57.0      f       0
## 2   8 1.724   67.5      f       0
## 3   7 1.720   54.5      f       0
## 4   9 1.558   53.0      m       0
## 5   9 1.895   57.0      m       0
## 6   8 2.336   61.0      f       0

Before doing anything else, let us convert gender to 0/1: 1 represents male and 0 represents female.

FEVdata$gender <- ifelse(FEVdata$gender == "m", 1, 0) # note: this overwrites the original character column with 0/1
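An equivalent recode, shown here on a toy vector rather than the real column: comparing to "m" gives TRUE/FALSE, and as.integer() maps that to 1/0 directly.

```r
# Equivalent recode, illustrated on a toy vector (not the real column):
g <- c("m", "f", "f", "m")
as.integer(g == "m")  # 1 0 0 1
```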

Analyzing Dataset

Plotting FEV against age:

plot(x=FEVdata$age,y=FEVdata$fev,xlab = "Age",ylab = "FEV", col= c("darkgreen"))

We observe that FEV increases roughly linearly with age.

Plotting FEV against height:

plot(x=FEVdata$height,y=FEVdata$fev,xlab = "Height",ylab = "FEV", col= c("darkblue"))

We observe that FEV increases roughly linearly with height.

Plotting FEV against gender:

plot(x=FEVdata$gender,y=FEVdata$fev,xlab = "Gender (0 = female, 1 = male)",ylab = "FEV", col= c("darkgreen"))

We observe that males generally have a higher FEV than females.

Plotting FEV against smoking status:

plot(x=FEVdata$smoking,y=FEVdata$fev,xlab = "Smoking (0 = no, 1 = yes)",ylab = "FEV", col= c("darkblue"))

This plot alone is not conclusive, so let us compare the group means directly.

# Vectorized logical subsetting: no explicit loop needed (and no risk of
# the off-by-one error that an index starting at 0 causes in R, where
# assigning to x[0] is a silent no-op).
smokers <- FEVdata$fev[FEVdata$smoking == 1]
non_smokers <- FEVdata$fev[FEVdata$smoking == 0]
mean_of_smoker <- mean(smokers)
mean_of_nonsmoker <- mean(non_smokers)

cat("Mean of Smoker:",mean_of_smoker,fill = TRUE)
## Mean of Smoker: 3.261483
cat("Mean of non-Smoker:", mean_of_nonsmoker)
## Mean of non-Smoker: 2.636331

After computing the group means we see that the smokers' mean FEV is higher than the non-smokers'. This does not mean smoking improves lung capacity: the smokers in this dataset tend to be the older children, and age is strongly related to FEV, so the comparison is confounded by age.
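The group comparison above can be done without a loop; the sketch below uses a small made-up data frame (not the real fev.txt) to show the one-liner, and applies the same one-liner to age to illustrate why age is a plausible confounder when the smokers are older.

```r
# Toy data frame mimicking the structure of fev.txt (values invented):
toy <- data.frame(
  age     = c(9, 10, 15, 16, 8, 14),
  fev     = c(1.7, 2.1, 3.4, 3.6, 1.6, 3.1),
  smoking = c(0, 0, 1, 1, 0, 1)
)
tapply(toy$fev, toy$smoking, mean)  # mean FEV by smoking status
tapply(toy$age, toy$smoking, mean)  # mean age by smoking status: smokers are older here
```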

Now we plot the distribution of FEV:

hist(FEVdata$fev, breaks = 50, xlab = "FEV", main = "FEV distribution")

We can see that most children's FEV lies between 2 and 3 litres.

  2. Explain the overall features of the statistical model such as the role of the parameters and the inferential goals of the analysis

We observe that FEV is approximately normally distributed, so we model it with a normal likelihood.

We will create a model in JAGS and analyze it. For now we include only the Age and Smoke covariates.

\[y_i \sim N(\mu_i,\tau) \\ \mu_i = \beta_1 + \beta_2(Age_i)+\beta_3(Smoke_i) \\ Prior\ distribution\ of\ the\ parameters: \\ \beta_1 \sim N(0,0.001) \\ \beta_2 \sim N(0,0.001) \\ \beta_3 \sim N(0,0.001) \\ \tau \sim Gamma(0.001,0.001)\]

Here \(N(\mu,\tau)\) follows the JAGS convention: the second argument is the precision \(\tau = 1/\sigma^2\), so \(N(0,0.001)\) is a diffuse prior with variance 1000.

model1.jags <- function() {
  # Likelihood
  for(i in 1:n){
    fev[i] ~ dnorm(mu[i],tau)
    mu[i] <- beta[1] + beta[2]*age[i] + beta[3]*smoke[i]
  }
  beta[1] ~ dnorm(0,0.001)  # Diffuse prior
  beta[2] ~ dnorm(0,0.001)
  beta[3] ~ dnorm(0,0.001)
  tau ~ dgamma(0.001,0.001) # Gamma prior keeps the precision positive
}

The parameter \(\tau\) is the precision (inverse variance) of the FEV observations. The parameter \(\beta_1\) is the intercept, which adjusts the overall level of the mean. The parameter \(\beta_2\) is the weight of the age covariate: the larger it is, the more FEV depends on age. The parameter \(\beta_3\) plays the same role for the smoking indicator.
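Since JAGS's dnorm takes a precision rather than a standard deviation, a small helper (a convenience sketch, not part of the model code) makes the conversion explicit:

```r
# In JAGS, dnorm(mu, tau) takes a *precision* tau = 1/sigma^2.
prec_to_sd <- function(tau) 1 / sqrt(tau)
prec_to_sd(0.001)  # the diffuse prior above corresponds to sd ~ 31.6
prec_to_sd(4)      # precision 4 corresponds to sd 0.5
```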

With the likelihood and priors above, we can run MCMC on the data and estimate the parameters. As the initial point of each Markov chain we set all \(\beta = 0\) and \(\tau = 1\), and we run 9,000 MCMC iterations with 3 chains, discarding the first 1,000 as burn-in.

Let us prepare the data for JAGS.

# Preparing data for JAGS
n <- length(FEVdata$age)

fev <- FEVdata$fev
smoke <- FEVdata$smoking
age <- FEVdata$age

#Lets only include Age and Smoke features and analyze them
dat.jags <- list("n","fev","smoke", "age")


# Defining parameters of interest
mod.params <- c("beta","tau") 

# Starting values
mod.inits <- function(){
  list("tau" = 1, "beta" = c(0,0,0))
}
# Run JAGS
set.seed(1873829) # adding my matricula
mod1.fit <- jags(data = dat.jags,                                        # DATA
                model.file = model1.jags, inits = mod.inits,                  # MODEL
                parameters.to.save = mod.params,                  
                n.chains = 3, n.iter = 9000, n.burnin = 1000, n.thin=10)# MCMC
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 606
##    Unobserved stochastic nodes: 4
##    Total graph size: 1860
## 
## Initializing model
mod1.fit
## Inference for Bugs model at "/var/folders/yy/6ydy4yyd44zclz1dgg2076lw0000gn/T//RtmphrLy0P/model639d37ea12ba.txt", fit using jags,
##  3 chains, each with 9000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 2400 iterations saved
##           mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat
## beta[1]     0.208   0.103    0.004    0.137    0.210    0.276    0.403 1.002
## beta[2]     0.248   0.010    0.228    0.241    0.247    0.255    0.267 1.002
## beta[3]    -0.236   0.084   -0.402   -0.293   -0.236   -0.179   -0.068 1.002
## tau         3.135   0.182    2.788    3.010    3.133    3.256    3.497 1.002
## deviance 1030.038   2.928 1026.479 1027.888 1029.414 1031.413 1037.537 1.003
##          n.eff
## beta[1]   1800
## beta[2]   1900
## beta[3]   1100
## tau       1100
## deviance  1100
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 4.3 and DIC = 1034.3
## DIC is an estimate of expected predictive error (lower deviance is better).
chainArray <- mod1.fit$BUGSoutput$sims.array

  3. Illustrate the main inferential findings (Bayesian point and interval estimation, hypothesis testing)

From the output above we obtain posterior means of about 0.208 for \(\beta_1\), 0.248 for \(\beta_2\), -0.236 for \(\beta_3\) and 3.135 for \(\tau\). The penalization term pD is 4.3, and the DIC, a measure of goodness of fit across the chains, is 1034.3.

Let us get the Bayesian point and interval estimates from our model.

chainMat <- mod1.fit$BUGSoutput$sims.matrix
# Intervals
cred <- 0.95

cat("Here are point estimates (posterior means) of the different parameters:", fill = TRUE)
## Here are point estimates (posterior means) of the different parameters:
# Point estimates
(par.hat.jags <- colMeans(chainMat))
##      beta[1]      beta[2]      beta[3]     deviance          tau 
##    0.2076090    0.2475994   -0.2357305 1030.0375901    3.1349292
cat("", fill = TRUE)
cat("Here are equal tail intervals of our parameters we estimated:", fill = TRUE)
## Here are equal tail intervals of our parameters we estimated:
# Intervals
(par.ET.jags <- apply(chainMat, 2, quantile, 
                    prob=c((1-cred)/2, 1-(1-cred)/2)))
##           beta[1]   beta[2]     beta[3] deviance      tau
## 2.5%  0.004133159 0.2283579 -0.40248049 1026.479 2.787757
## 97.5% 0.402949602 0.2673897 -0.06804479 1037.537 3.497325
cat("", fill = TRUE)
cat("Here are HPD intervals using coda of our model:", fill = TRUE)
## Here are HPD intervals using coda of our model:
# What about the HPD?
(par.HPD.jags <- coda::HPDinterval(as.mcmc(chainMat)))
##                  lower         upper
## beta[1]     0.01211304    0.40702379
## beta[2]     0.22886315    0.26778785
## beta[3]    -0.41606811   -0.08851262
## deviance 1026.11503822 1035.90043542
## tau         2.76242954    3.47113071
## attr(,"Probability")
## [1] 0.95
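The HPD interval reported by coda is the shortest interval containing 95% of the posterior draws. A hand-rolled sketch of the same idea (illustrated on standard normal draws, not the actual chains) sorts the samples and slides a fixed-coverage window:

```r
# Sketch: a 95% HPD interval is the *shortest* interval containing 95% of
# the draws; this mirrors the idea behind coda::HPDinterval().
hpd <- function(draws, cred = 0.95) {
  s <- sort(draws)
  n <- length(s)
  k <- floor(cred * n)                    # window width in number of draws
  widths <- s[(k + 1):n] - s[1:(n - k)]   # width of every candidate window
  i <- which.min(widths)                  # shortest window wins
  c(lower = s[i], upper = s[i + k])
}
set.seed(1)
hpd(rnorm(10000))  # should be close to (-1.96, 1.96) for a standard normal
```

For a symmetric unimodal posterior the HPD and equal-tail intervals nearly coincide, which is why the two sets of intervals above are so similar.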

  4. Illustration of the features of the MCMC convergence diagnostics and error control.

Let us run some diagnostics using bayesplot.

# Plots with BayesPlot
chainArray <- mod1.fit$BUGSoutput$sims.array
bayesplot::mcmc_combo(chainArray) # trace and density plots for each parameter

The bayesplot output above shows the chain values for each parameter. The model appears to have converged, because the different chains explore the same region of the parameter space.

bayesplot::mcmc_acf(chainArray)

The autocorrelation plots above start at 1 (at lag 0) and drop quickly to near zero for every parameter in each chain, which is what we want.

Let us now use coda, which provides the Geweke, Gelman and Heidelberger diagnostics.

coda.fit <- coda::as.mcmc(mod1.fit)
coda::geweke.diag(coda.fit) # 
## [[1]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##  beta[1]  beta[2]  beta[3] deviance      tau 
##  -1.0803   1.0386   0.7099   0.1273   1.9995 
## 
## 
## [[2]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##  beta[1]  beta[2]  beta[3] deviance      tau 
##   0.4579  -0.1715  -1.2364   1.4817   1.5684 
## 
## 
## [[3]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##  beta[1]  beta[2]  beta[3] deviance      tau 
##   1.2742  -1.1209   0.3636   0.1248   1.2609

The Geweke diagnostic compares the mean of the first 10% of each chain with the mean of the last 50%; if the two are approximately equal (|z| below about 2), the chain has likely converged. Let us plot it here:

coda::geweke.plot(coda.fit)
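The comparison behind the Geweke statistic can be sketched directly. This toy version uses naive var()/n standard errors on simulated independent draws, whereas coda::geweke.diag uses spectral-density estimates that account for autocorrelation:

```r
# Toy sketch of the Geweke idea: standardized difference between the mean
# of the first 10% of a chain and the mean of its last 50%.
geweke_z <- function(chain, frac1 = 0.1, frac2 = 0.5) {
  n  <- length(chain)
  x1 <- chain[1:floor(frac1 * n)]
  x2 <- chain[(n - floor(frac2 * n) + 1):n]
  (mean(x1) - mean(x2)) / sqrt(var(x1) / length(x1) + var(x2) / length(x2))
}
set.seed(1873829)
geweke_z(rnorm(2400))  # for a stationary chain, |z| should usually be < 2
```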

coda::gelman.diag(coda.fit)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## beta[1]           1       1.01
## beta[2]           1       1.01
## beta[3]           1       1.01
## deviance          1       1.00
## tau               1       1.00
## 
## Multivariate psrf
## 
## 1
coda::gelman.plot(coda.fit)

This is the potential scale reduction factor, which we want to be near 1 (values close to 1 indicate convergence). In the plots, the median (black line) should stay under the 97.5% quantile (red line).

Let us now run the Heidelberger-Welch diagnostic and analyze it.

coda::heidel.diag(coda.fit)
## [[1]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed        1        0.749  
## beta[2]  passed        1        0.753  
## beta[3]  passed        1        0.906  
## deviance passed        1        0.699  
## tau      passed       81        0.128  
##                                      
##          Halfwidth Mean     Halfwidth
##          test                        
## beta[1]  passed       0.209 0.007196 
## beta[2]  passed       0.248 0.000706 
## beta[3]  passed      -0.232 0.005729 
## deviance passed    1030.057 0.206820 
## tau      passed       3.140 0.013204 
## 
## [[2]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed       1         0.6800 
## beta[2]  passed       1         0.6779 
## beta[3]  passed       1         0.7395 
## deviance passed       1         0.0801 
## tau      passed       1         0.4107 
##                                      
##          Halfwidth Mean     Halfwidth
##          test                        
## beta[1]  passed       0.211 0.007412 
## beta[2]  passed       0.247 0.000732 
## beta[3]  passed      -0.234 0.005932 
## deviance passed    1030.183 0.209045 
## tau      passed       3.125 0.012588 
## 
## [[3]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed       1         0.646  
## beta[2]  passed       1         0.570  
## beta[3]  passed       1         0.825  
## deviance passed       1         0.435  
## tau      passed       1         0.103  
##                                      
##          Halfwidth Mean     Halfwidth
##          test                        
## beta[1]  passed       0.203 0.007467 
## beta[2]  passed       0.248 0.000661 
## beta[3]  passed      -0.241 0.005802 
## deviance passed    1029.874 0.192069 
## tau      passed       3.137 0.012560

The Heidelberger-Welch diagnostic checks each chain in two steps: first, whether the chain is stationary, which all parameters pass above; second, whether the estimate is accurate enough, by comparing the halfwidth of a confidence interval for the mean against the mean itself.
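The halfwidth step can be sketched as follows (a simplified illustration on simulated draws, using batch means for the Monte Carlo standard error; coda's implementation differs in detail but applies the same pass/fail rule with a default relative tolerance of 0.1):

```r
# Toy sketch of the halfwidth part of heidel.diag: estimate the Monte Carlo
# standard error by batch means, form a 95% halfwidth, and pass if it is
# small relative to the mean.
halfwidth_test <- function(chain, eps = 0.1, nbatch = 30) {
  grp  <- rep(1:nbatch, each = length(chain) %/% nbatch)[1:length(chain)]
  bm   <- tapply(chain, grp, mean)      # batch means
  mcse <- sd(bm) / sqrt(nbatch)         # standard error of the overall mean
  hw   <- 1.96 * mcse
  list(mean = mean(chain), halfwidth = hw,
       passed = hw < eps * abs(mean(chain)))
}
set.seed(1)
halfwidth_test(rnorm(2400, mean = 3.14, sd = 0.18))
```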

  5. Discuss one possible alternative statistical model and illustrate results of model comparison through DIC and/or marginal likelihood (see also Chapter 11 in Ntzoufras (2010))

Let us create another model with a slight modification (an age-smoking interaction term) and compare its DIC to that of the model we already created.

\[y_i \sim N(\mu_i,\tau) \\ \mu_i = \beta_1 + \beta_2(Age_i)+\beta_3(Smoke_i)+\beta_4(Age_i)(Smoke_i) \\ Prior\ distribution\ of\ the\ parameters: \\ \beta_1 \sim N(0,0.001) \\ \beta_2 \sim N(0,0.001) \\ \beta_3 \sim N(0,0.001) \\ \beta_4 \sim N(0,0.001) \\ \tau \sim Gamma(0.001,0.001)\]

In this model, the parameter \(\beta_4\) weights the product of the smoking and age variables: the larger it is, the more FEV depends on the age-smoking interaction.

Here is the model in JAGS:

model2.jags <- function() {
  # Likelihood
  for(i in 1:n){
    fev[i] ~ dnorm(mu[i],tau)
    mu[i] <- beta[1] + beta[2]*age[i] + beta[3]*smoke[i] + beta[4]*age[i]*smoke[i]
  }
  beta[1] ~ dnorm(0,0.001)  # Diffuse prior
  beta[2] ~ dnorm(0,0.001)
  beta[3] ~ dnorm(0,0.001)
  beta[4] ~ dnorm(0,0.001)
  tau ~ dgamma(0.001,0.001)
}

Let us run the model with the same data we already prepared for model 1.

# beta now has four components, so its initial value must be c(0,0,0,0)
mod2.inits <- function(){
  list("tau" = 1, "beta" = c(0,0,0,0))
}
set.seed(1873829)
mod2.fit <- jags(data = dat.jags,                                        # DATA
                model.file = model2.jags, inits = mod2.inits,                  # MODEL
                parameters.to.save = mod.params,                  
                n.chains = 3, n.iter = 9000, n.burnin = 1000, n.thin=10)# MCMC
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 606
##    Unobserved stochastic nodes: 5
##    Total graph size: 1882
## 
## Initializing model
mod2.fit
## Inference for Bugs model at "/var/folders/yy/6ydy4yyd44zclz1dgg2076lw0000gn/T//RtmphrLy0P/model639d231dcd24.txt", fit using jags,
##  3 chains, each with 9000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 2400 iterations saved
##           mu.vect sd.vect    2.5%      25%      50%      75%    97.5%  Rhat
## beta[1]     0.064   0.102  -0.130   -0.006    0.063    0.134    0.265 1.000
## beta[2]     0.262   0.010   0.242    0.255    0.262    0.269    0.281 1.001
## beta[3]     2.259   0.484   1.306    1.943    2.244    2.576    3.229 1.001
## beta[4]    -0.193   0.037  -0.266   -0.216   -0.192   -0.168   -0.121 1.001
## tau         3.262   0.186   2.908    3.135    3.261    3.387    3.639 1.001
## deviance 1004.046   3.198 999.821 1001.664 1003.384 1005.712 1011.753 1.001
##          n.eff
## beta[1]   2400
## beta[2]   2400
## beta[3]   2400
## beta[4]   2400
## tau       2400
## deviance  2400
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.1 and DIC = 1009.2
## DIC is an estimate of expected predictive error (lower deviance is better).
chainArray <- mod2.fit$BUGSoutput$sims.array

Comparing DIC values, the second model's DIC (1009.2) is noticeably lower than the first model's (1034.3), so the second model fits better. I will not repeat the convergence diagnostics for this model, because the parameter posteriors change only slightly.
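The pD and DIC numbers printed by R2jags follow from the saved deviance samples via the rule quoted in the output, pD = var(deviance)/2 and DIC = mean(deviance) + pD. A sketch on a made-up deviance vector (not the actual saved samples):

```r
# DIC "by hand" from a vector of deviance samples, per the rule in the
# R2jags output: pD = var(deviance)/2, DIC = mean(deviance) + pD.
dic_from_deviance <- function(dev) {
  pD <- var(dev) / 2
  c(pD = pD, DIC = mean(dev) + pD)
}
set.seed(1)
dic_from_deviance(rnorm(2400, mean = 1004, sd = 3.2))  # roughly pD ~ 5, DIC ~ 1009
```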

  6. Check the ability of a fully Bayesian analysis to recover model parameters with data simulated from the model

Let us simulate data from the model, using the parameter estimates obtained above, and check whether the analysis recovers those parameters; this serves as a diagnostic that nothing is wrong with the model itself.

N <- 606 # sample size

set.seed(1873829)
# Simulate the smoking indicator
sim.smoke <- sample(c(0,1), replace=TRUE, size=N)

# Simulate the age covariate (as you prefer)
sim.age <- sample(6:17, replace=TRUE, size=N)

# Fix the parameters at the posterior means obtained from model 2
beta <- c(0.064, 0.262, 2.259, -0.193)
tau <- 3.262            # posterior mean of the precision
sigma <- 1/sqrt(tau)    # rnorm() takes a standard deviation, not a precision

# Compute the linear predictor (vectorized; no loop needed)
sim.mus <- beta[1] + beta[2]*sim.age + beta[3]*sim.smoke + beta[4]*sim.age*sim.smoke

# Simulate the response according to the model
sim.fev <- rnorm(N, sim.mus, sigma)

sim.dat <- data.frame(age=sim.age, smoking= sim.smoke, fev=sim.fev)
head(sim.dat)
##   age smoking      fev
## 1  12       1 3.242332
## 2   6       0 1.124224
## 3  11       0 2.786389
## 4  11       1 3.149172
## 5  14       0 3.631531
## 6  11       0 3.057191

The model itself is unchanged; we only define it again under the name sim.model.jags.

sim.model.jags <- function()  {
  # Likelihood
  for(i in 1:n){
    fev[i] ~ dnorm(mu[i],tau)
    mu[i] <- beta[1] + beta[2]* age[i] + beta[3]*smoke[i]+ beta[4]*age[i]*smoke[i]
  }
  
  beta[1] ~ dnorm(0, 0.001)  # Diffuse prior
  beta[2] ~ dnorm(0, 0.001)
  beta[3] ~ dnorm(0, 0.001)
  beta[4] ~ dnorm(0, 0.001)
  tau ~ dgamma(0.001, 0.001)
}

Let us initialize the model and run it on the simulated data.

# data that JAGS will use (smoke must be the simulated smoking indicator)
sim.dat.jags <- list(n=N, fev=sim.fev, age=sim.age, smoke=sim.smoke)

# parameters of interest
sim.mod.params  <- c("beta","tau")

# Starting values
sim.mod.inits <- function(){
  list("tau" = 1, "beta" = c(0,0,0,0))
}
set.seed(1873829)

sim.mod.fit <- jags(data = sim.dat.jags,                                        # DATA
                model.file = sim.model.jags, inits = sim.mod.inits,                  # MODEL
                parameters.to.save = sim.mod.params,                  
                n.chains = 3, n.iter = 9000, n.burnin = 1000, n.thin=10) # MCMC
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 606
##    Unobserved stochastic nodes: 5
##    Total graph size: 1874
## 
## Initializing model
sim.mod.fit 
## Inference for Bugs model at "/var/folders/yy/6ydy4yyd44zclz1dgg2076lw0000gn/T//RtmphrLy0P/model639d9606104.txt", fit using jags,
##  3 chains, each with 9000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 2400 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
## beta[1]    1.439   0.189   1.079   1.309   1.437   1.570   1.811 1.000  2400
## beta[2]   -0.397  22.969 -46.269 -15.937  -0.009  15.331  44.406 1.001  2400
## beta[3]    0.529  22.969 -44.259 -15.196   0.137  16.045  46.416 1.001  2400
## beta[4]    0.001   0.001  -0.002   0.000   0.001   0.002   0.004 1.000  2400
## tau        6.700   0.382   5.979   6.440   6.699   6.955   7.485 1.001  2400
## deviance 567.899   2.835 564.341 565.811 567.252 569.381 575.020 1.001  2400
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 4.0 and DIC = 571.9
## DIC is an estimate of expected predictive error (lower deviance is better).

Looking at the output above, \(\beta_2\) and \(\beta_3\) have enormous posterior standard deviations (about 23), and \(\beta_1\) and \(\tau\) are far from the values used to generate the data, so this run did not actually recover the parameters. The reason is that the fit shown here received the age vector as the smoke covariate, making the two covariates perfectly collinear, so that only the sum \(\beta_2 + \beta_3\) is identified. With the data list passing smoke = sim.smoke, the posterior means should concentrate near the generating values.

Just to be sure, let us perform the Heidelberger-Welch test with coda.

coda.fit <- coda::as.mcmc(sim.mod.fit)
coda::heidel.diag(coda.fit)
## [[1]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed       1         0.0662 
## beta[2]  passed       1         0.5688 
## beta[3]  passed       1         0.5677 
## beta[4]  passed       1         0.1500 
## deviance passed       1         0.6474 
## tau      passed       1         0.7647 
##                                       
##          Halfwidth Mean      Halfwidth
##          test                         
## beta[1]  passed      1.44112 0.013128 
## beta[2]  failed     -0.32697 1.732909 
## beta[3]  failed      0.45856 1.733017 
## beta[4]  passed      0.00114 0.000102 
## deviance passed    567.92646 0.191424 
## tau      passed      6.70859 0.026958 
## 
## [[2]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed       1         0.207  
## beta[2]  passed       1         0.700  
## beta[3]  passed       1         0.699  
## beta[4]  passed       1         0.377  
## deviance passed       1         0.142  
## tau      passed       1         0.324  
##                                      
##          Halfwidth Mean     Halfwidth
##          test                        
## beta[1]  passed    1.44e+00 1.26e-02 
## beta[2]  failed    1.05e-01 1.61e+00 
## beta[3]  failed    2.70e-02 1.61e+00 
## beta[4]  passed    1.14e-03 8.57e-05 
## deviance passed    5.68e+02 2.03e-01 
## tau      passed    6.69e+00 2.85e-02 
## 
## [[3]]
##                                        
##          Stationarity start     p-value
##          test         iteration        
## beta[1]  passed       1         0.0549 
## beta[2]  passed       1         0.8709 
## beta[3]  passed       1         0.8692 
## beta[4]  passed       1         0.0518 
## deviance passed       1         0.6644 
## tau      passed       1         0.2470 
##                                       
##          Halfwidth Mean      Halfwidth
##          test                         
## beta[1]  passed      1.43976 0.013474 
## beta[2]  failed     -0.96874 1.575041 
## beta[3]  failed      1.10041 1.575051 
## beta[4]  passed      0.00114 0.000103 
## deviance passed    567.84628 0.195409 
## tau      passed      6.69950 0.024182

The halfwidth test fails for \(\beta_2\) and \(\beta_3\): their posteriors are extremely wide (standard deviations near 23), so the halfwidth is large relative to the mean, a sign that these two parameters are not well identified in this run. To be sure, let us run one more convergence diagnostic.

coda::gelman.diag(coda.fit)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## beta[1]           1       1.01
## beta[2]           1       1.00
## beta[3]           1       1.00
## beta[4]           1       1.01
## deviance          1       1.00
## tau               1       1.00
## 
## Multivariate psrf
## 
## 1
coda::gelman.plot(coda.fit)

After this analysis we can say that the chains themselves converged (the scale reduction factors are all near 1), but the run shown did not recover the generating values for \(\beta_2\) and \(\beta_3\): their posteriors are almost flat. The simulation check should therefore be repeated making sure the smoke covariate passed to JAGS is sim.smoke rather than sim.age.

  7. Comparative analysis with frequentist inference

We will fit a linear model to the data and estimate the same parameters. Linear regression is the natural frequentist counterpart of our Bayesian model. For this analysis we again use only the age and smoking features (plus their interaction), mirroring the Bayesian analysis.

smoke<-FEVdata$smoking
fev<- FEVdata$fev
age<- FEVdata$age
n <- length(fev)
lr <- lm(fev~age+smoke+age:smoke) # linear regression

# should display summary of the model 
summary(lr)
## 
## Call:
## lm(formula = fev ~ age + smoke + age:smoke)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.39397 -0.35731 -0.02474  0.31752  2.01363 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.06470    0.10087   0.641    0.522    
## age          0.26214    0.01000  26.213  < 2e-16 ***
## smoke        2.26181    0.48397   4.673 3.66e-06 ***
## age:smoke   -0.19292    0.03685  -5.236 2.28e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5536 on 602 degrees of freedom
## Multiple R-squared:  0.5569, Adjusted R-squared:  0.5547 
## F-statistic: 252.2 on 3 and 602 DF,  p-value: < 2.2e-16
beta <- coef(lr)
X <- cbind(rep(1,n),age,smoke,age*smoke)

# studentized residuals, split by smoking status
rstudsmoke <- rstudent(lr)[smoke==1]
rstudnonsmoke <- rstudent(lr)[smoke==0]

We can observe the frequentist parameter estimates above. Now we check the fitted line against the data from which the model was estimated. fitted() is a generic function that extracts fitted values from objects returned by modeling functions, and resid() extracts the model residuals.

par(mfrow=c(2,2))
plot(fitted(lr),resid(lr),xlab="Fitted Values", ylab="Residuals",main="", )
lines(lowess(fitted(lr),resid(lr)),col= c("red"))
qqnorm(resid(lr),main="")
qqline(resid(lr), col=c("red"))

Now we examine the residuals for smokers only.

par(mfrow=c(2,2))
plot(fitted(lr)[smoke==1],rstudsmoke,xlab="Fitted Values", ylab="Studentized Residuals",main="Smokers \n Residual Plot")
lines(lowess(fitted(lr)[smoke==1],rstudsmoke), col= c("red"))
qqnorm(rstudsmoke,main="Smokers \n Normal Plot")
qqline(rstudsmoke,col= c("red"))

Now we examine the residuals for non-smokers only.

par(mfrow=c(2,2))
plot(fitted(lr)[smoke==0],rstudnonsmoke,xlab="Fitted Values", ylab="Studentized Residuals",main="Nonsmokers \n Residual Plot")
lines(lowess(fitted(lr)[smoke==0],rstudnonsmoke), col= c("red"))
qqnorm(rstudnonsmoke,main="Nonsmokers \n Normal Plot")
qqline(rstudnonsmoke,col= c("red"))

Now we examine the residuals as a function of age.

plot(age, rstudent(lr),xlab="Age", ylab="Residuals",main="")
lines(lowess(age,rstudent(lr)), col= c("red"))

Finally, we compare the frequentist model with Bayesian model 2, the more accurate one with the lower DIC value.

summary(lr)
## 
## Call:
## lm(formula = fev ~ age + smoke + age:smoke)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.39397 -0.35731 -0.02474  0.31752  2.01363 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.06470    0.10087   0.641    0.522    
## age          0.26214    0.01000  26.213  < 2e-16 ***
## smoke        2.26181    0.48397   4.673 3.66e-06 ***
## age:smoke   -0.19292    0.03685  -5.236 2.28e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5536 on 602 degrees of freedom
## Multiple R-squared:  0.5569, Adjusted R-squared:  0.5547 
## F-statistic: 252.2 on 3 and 602 DF,  p-value: < 2.2e-16
mod2.fit
## Inference for Bugs model at "/var/folders/yy/6ydy4yyd44zclz1dgg2076lw0000gn/T//RtmphrLy0P/model639d231dcd24.txt", fit using jags,
##  3 chains, each with 9000 iterations (first 1000 discarded), n.thin = 10
##  n.sims = 2400 iterations saved
##           mu.vect sd.vect    2.5%      25%      50%      75%    97.5%  Rhat
## beta[1]     0.064   0.102  -0.130   -0.006    0.063    0.134    0.265 1.000
## beta[2]     0.262   0.010   0.242    0.255    0.262    0.269    0.281 1.001
## beta[3]     2.259   0.484   1.306    1.943    2.244    2.576    3.229 1.001
## beta[4]    -0.193   0.037  -0.266   -0.216   -0.192   -0.168   -0.121 1.001
## tau         3.262   0.186   2.908    3.135    3.261    3.387    3.639 1.001
## deviance 1004.046   3.198 999.821 1001.664 1003.384 1005.712 1011.753 1.001
##          n.eff
## beta[1]   2400
## beta[2]   2400
## beta[3]   2400
## beta[4]   2400
## tau       2400
## deviance  2400
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.1 and DIC = 1009.2
## DIC is an estimate of expected predictive error (lower deviance is better).

We see that both approaches produce very similar results: the frequentist coefficient estimates are almost identical to the posterior means of the corresponding parameters in the Bayesian model. In the Bayesian model, FEV is normally distributed with precision \(\tau\), so the Bayesian analysis also yields a full posterior for the residual scale, whereas the frequentist approach returns a single residual standard error. The two agree here as well: \(1/0.5536^2 \approx 3.26\), matching the posterior mean of \(\tau\) (3.262).
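The correspondence between the frequentist residual scale and the Bayesian precision can be checked with one line in each direction, since \(\tau = 1/\sigma^2\):

```r
# Sanity check: the frequentist residual standard error (0.5536) and the
# posterior mean of the JAGS precision tau (3.262) describe the same
# residual scale, since tau = 1/sigma^2.
1 / 0.5536^2     # ~ 3.26, close to the posterior mean of tau
1 / sqrt(3.262)  # ~ 0.554, close to the residual standard error
```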